Question Analysis Report

Generated: 2025-07-03T20:51:20.618182

Executive Summary

Dataset Size: 32,400 observations
Features: 478 total
Models Analyzed: 10 outcomes
Best R²: 0.895

Model Performance Summary

Outcome | R² | Adj. R² | F-statistic | F p-value | AIC | BIC | RMSE | N Significant Features | High VIF Features | Mean VIF | Max VIF | Sample Size
proportion_left_leaning | 0.2104 | 0.2092 | 187.36 | 0.0000 | 214575.8 | 214969.9 | 6.6306 | 17 | 0 | 1.47 | 2.94 | 32,400
proportion_right_leaning | 0.0179 | 0.0165 | 12.84 | 0.0000 | 105413.4 | 105807.5 | 1.2301 | 21 | 0 | 1.47 | 2.94 | 32,400
proportion_center_leaning | 0.8953 | 0.8952 | 6016.63 | 0.0000 | 216105.0 | 216499.2 | 6.7889 | 17 | 0 | 1.47 | 2.94 | 32,400
proportion_high_quality | 0.8938 | 0.8937 | 5919.79 | 0.0000 | 217852.5 | 218246.7 | 6.9745 | 25 | 0 | 1.47 | 2.94 | 32,400
proportion_low_quality | 0.0577 | 0.0564 | 43.10 | 0.0000 | 162726.7 | 163120.8 | 2.9789 | 14 | 0 | 1.47 | 2.94 | 32,400
news_proportion_left_leaning | 0.1143 | 0.1130 | 90.76 | 0.0000 | 278292.8 | 278686.9 | 17.7250 | 21 | 0 | 1.47 | 2.94 | 32,400
news_proportion_right_leaning | 0.0187 | 0.0173 | 13.44 | 0.0000 | 202672.9 | 203067.0 | 5.5179 | 17 | 0 | 1.47 | 2.94 | 32,400
news_proportion_center_leaning | 0.5295 | 0.5288 | 791.52 | 0.0000 | 307707.6 | 308101.8 | 27.9079 | 26 | 0 | 1.47 | 2.94 | 32,400
news_proportion_high_quality | 0.5511 | 0.5504 | 863.38 | 0.0000 | 305112.1 | 305506.3 | 26.8122 | 23 | 0 | 1.47 | 2.94 | 32,400
news_proportion_low_quality | 0.0299 | 0.0285 | 21.65 | 0.0000 | 232490.0 | 232884.1 | 8.7421 | 15 | 0 | 1.47 | 2.94 | 32,400
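
The columns of the performance table can be tied together with a few textbook OLS identities, which gives a rough consistency check on the reported numbers. The sketch below is not the report's own code. It assumes the standard information-criterion definitions used by statsmodels (AIC = -2*llf + 2k and BIC = -2*llf + k*ln(n), with k the number of estimated parameters) and assumes each model includes an intercept. The function names are hypothetical.

```python
import math

N_OBS = 32_400  # sample size reported in the table


def params_from_ic(aic: float, bic: float, n: int) -> float:
    """Solve BIC - AIC = k * (ln(n) - 2) for k, the number of estimated parameters."""
    return (bic - aic) / (math.log(n) - 2.0)


def adjusted_r2(r2: float, n: int, df_model: int) -> float:
    """Adjusted R-squared for a model with an intercept and df_model predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - df_model - 1)


def f_statistic(r2: float, n: int, df_model: int) -> float:
    """Overall F-statistic implied by R-squared (intercept assumed)."""
    df_resid = n - df_model - 1
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


# Values copied from the proportion_left_leaning row.
k = params_from_ic(aic=214575.8, bic=214969.9, n=N_OBS)
df_model = round(k) - 1  # assumption: one of the k parameters is the intercept
adj = adjusted_r2(0.2104, N_OBS, df_model)
f = f_statistic(0.2104, N_OBS, df_model)

print(f"estimated parameters: {k:.2f}")
print(f"adjusted R2: {adj:.4f}, F: {f:.2f}")
```

For the proportion_left_leaning row this recovers roughly 47 estimated parameters, which under the intercept assumption means 46 predictors. The recomputed adjusted R² and F-statistic agree with the table up to rounding of the four-decimal R².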

Correlation Matrix

Feature Importance

Regression Coefficients by Outcome

proportion_left_leaning (R² = 0.210, 27 features)

proportion_right_leaning (R² = 0.018, 27 features)

proportion_center_leaning (R² = 0.895, 27 features)

proportion_high_quality (R² = 0.894, 27 features)

proportion_low_quality (R² = 0.058, 27 features)

news_proportion_left_leaning (R² = 0.114, 27 features)

news_proportion_right_leaning (R² = 0.019, 27 features)

news_proportion_center_leaning (R² = 0.529, 27 features)

news_proportion_high_quality (R² = 0.551, 27 features)

news_proportion_low_quality (R² = 0.030, 27 features)

Model Family Comparisons

proportion_left_leaning

proportion_right_leaning

proportion_high_quality

proportion_news

num_citations

Multicollinearity Diagnostics

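Interpretation: The Variance Inflation Factor (VIF) measures how much the variance of a coefficient estimate is inflated by correlation between that predictor and the other predictors. A VIF of 1 indicates no collinearity, and values above 5 or 10 are common rules of thumb for problematic multicollinearity. Every model below reports 0 high-VIF features, with a mean VIF of 1.47 and a maximum of 2.94.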
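
To make the VIF values easier to read, recall that for predictor j, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on all other predictors. The minimal sketch below converts between the two. The helper names and the default threshold of 5.0 are assumptions for illustration; the report does not state which cutoff it uses to count "High VIF" features.

```python
def vif_from_r2(r2_j: float) -> float:
    """VIF for a predictor whose auxiliary regression has R-squared r2_j."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must be in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif: float) -> float:
    """Share of a predictor's variance explained by the other predictors."""
    if vif < 1.0:
        raise ValueError("VIF cannot be below 1")
    return 1.0 - 1.0 / vif


def is_high_vif(vif: float, threshold: float = 5.0) -> bool:
    """Flag a predictor using a rule-of-thumb threshold (assumed, not from the report)."""
    return vif > threshold


# Maximum and mean VIF reported for every model in this section.
print(f"max VIF 2.94 -> R2_j = {r2_from_vif(2.94):.3f}")
print(f"VIF 1.47     -> R2_j = {r2_from_vif(1.47):.3f}")
print("flagged:", is_high_vif(2.94))
```

Read this way, the most collinear predictor has about two thirds of its variance explained by the remaining predictors, which stays below both the 5 and 10 rules of thumb.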

proportion_left_leaning (High VIF: 0, Mean VIF: 1.47)

proportion_right_leaning (High VIF: 0, Mean VIF: 1.47)

proportion_center_leaning (High VIF: 0, Mean VIF: 1.47)

proportion_high_quality (High VIF: 0, Mean VIF: 1.47)

proportion_low_quality (High VIF: 0, Mean VIF: 1.47)

news_proportion_left_leaning (High VIF: 0, Mean VIF: 1.47)

news_proportion_right_leaning (High VIF: 0, Mean VIF: 1.47)

news_proportion_center_leaning (High VIF: 0, Mean VIF: 1.47)

news_proportion_high_quality (High VIF: 0, Mean VIF: 1.47)

news_proportion_low_quality (High VIF: 0, Mean VIF: 1.47)

Summary Statistics

Variable | Type | Mean | Std | Min | Max | N | Missing
num_citations | Citation Outcome | 5.7652 | 5.1669 | 0.0000 | 46.0000 | 32,400 | 0
proportion_high_quality | Citation Outcome | 8.9662 | 21.3873 | 0.0000 | 100.0000 | 32,400 | 0
proportion_left_leaning | Citation Outcome | 1.6659 | 7.4564 | 0.0000 | 100.0000 | 32,400 | 0
proportion_right_leaning | Citation Outcome | 0.0819 | 1.2404 | 0.0000 | 50.0000 | 32,400 | 0
news_proportion_high_quality | Citation Outcome | 21.8282 | 39.9889 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_left_leaning | Citation Outcome | 4.7865 | 18.8207 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_right_leaning | Citation Outcome | 0.3746 | 5.5664 | 0.0000 | 100.0000 | 32,400 | 0
proportion_news | Citation Outcome | 10.7818 | 23.2094 | 0.0000 | 100.0000 | 32,400 | 0
turn_number | Question/Response Feature | 1.7057 | 2.0636 | 1.0000 | 39.0000 | 32,400 | 0
total_turns | Question/Response Feature | 2.5335 | 3.5807 | 1.0000 | 50.0000 | 32,400 | 0
question_length_chars_log | Question/Response Feature | 0.0000 | 1.0000 | -3.8234 | 2.6358 | 32,400 | 0
question_length_words_log | Question/Response Feature | 0.0000 | 1.0000 | -2.2585 | 2.9189 | 32,400 | 0
response_length_log | Question/Response Feature | 0.0000 | 1.0000 | -7.0885 | 3.1220 | 32,400 | 0
response_word_count_log | Question/Response Feature | 0.0000 | 1.0000 | -5.6188 | 2.9660 | 32,400 | 0
model_family_google | Model Family | 7,563 observations | 23.3% | - | - | 32,400 | 0
model_family_openai | Model Family | 11,168 observations | 34.5% | - | - | 32,400 | 0
model_family_perplexity | Model Family | 13,669 observations | 42.2% | - | - | 32,400 | 0
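
The four *_log features have a mean of 0 and a standard deviation of 1, which is what a log transform followed by z-scoring would produce. The sketch below shows one way such a feature could be built. It is an assumption, not the report's documented pipeline: the use of log1p, the sample standard deviation, the function name, and the example lengths are all hypothetical.

```python
import math
import statistics


def log_standardize(values: list[float]) -> list[float]:
    """Apply log(1 + x), then z-score the result (sample std, ddof=1)."""
    logged = [math.log1p(v) for v in values]
    mean = statistics.fmean(logged)
    std = statistics.stdev(logged)
    return [(x - mean) / std for x in logged]


# Hypothetical raw question lengths in characters, for illustration only.
lengths = [12, 40, 95, 230, 1800]
z = log_standardize(lengths)

print([round(v, 3) for v in z])
print("mean:", round(statistics.fmean(z), 6), "std:", round(statistics.stdev(z), 6))
```

Because the transform is monotonic, the ordering of observations is preserved. Coefficients on such features are then read per standard deviation of the logged quantity rather than per character or word.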

Technical Details

Regression Method: OLS_statsmodels

PCA Precomputed: True

PCA Used: True

Total Features: 47